Study of the impact of proper name transliteration on the performance of word alignment in French-Arabic parallel corpora (Etude de l'impact de la translittération de noms propres sur la qualité de l'alignement de mots à partir de corpus parallèles français-arabe) [in French]

نویسندگان

  • Nasredine Semmar
  • Houda Saadane
چکیده

Bilingual lexicons play a vital role in cross-language information retrieval and machine translation. The manual construction of these lexicons is often costly and time consuming. Word alignment techniques are generally used to construct bilingual lexicons from parallel texts. Aligning single words and nominal syntagms from parallel texts is relatively a well controlled task for languages using Latin script but it is complex when the source and target languages do not share the same written script. A solution to this issue consists in writing the proper names present in the parallel corpus in the same written script. This paper presents, on the one hand, a system for automatic transliteration of proper names from Arabic to Latin script, and on the other hand, a tool to align single and compound words from FrenchArabic parallel text corpora. We have evaluated the word alignment tool integrating transliteration using two methods: A manual evaluation of the alignment quality and an evaluation of the impact of this alignment on the translation quality by using the statistical machine translation system Moses. The obtained results show that transliteration of proper names from Arabic to Latin improves the quality of both alignment and translation. Mots-clés : Lexique bilingue, translittération, alignement de mots, traduction automatique statistique, évaluation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

فایل کامل مجلّه مطالعات زبان فرانسه دو فصلنامه علمی پژوهشی زبان فرانسه دانشکده زبانهای خارجی دانشگاه اصفهان

Tâ ÇÉÅ wx W|xâ Revue des Études de la Langue Française Revue semestrielle de la Faculté des Langues Étrangères de l'Université d'Ispahan Cinquième année, N° 8 Printemps-Eté 2013, ISSN 2008- 6571 ISSN électronique 2322-469X Cette revue est indexée dans: Ulrichsweb: global serials directory http://ulrichsweb.serialssolutions.com Doaj: Directory of Open Access Journals http://www.doaj.org ...

متن کامل

Transliteration as Alignment vs. Transliteration as Generation for Crosslingual Information Retrieval

Crosslingual Information Retrieval (CLIR) usually requires query translation and, due to named entities in the case of IR, query translation requires a good transliteration system when writing systems differ. Transliteration can be seen as a problem of generation or alignment. For IR, since we can extract a word list from the corpus being searched, it should be seen as an alignment problem. The...

متن کامل

Translating Chinese Romanized Name into Chinese Idiographic Characters via Corpus and Web Validation

Cross-language information retrieval performance depends on the quality of the translation resources used to pass from a user’s source language query to target language documents. Translation lists of proper names are rare but vital resources for cross-language retrieval between languages using different character sets. Named entities translation dictionaries can be extracted from bilingual cor...

متن کامل

Joseph de Maistre and Retributionist Theology

Joseph de Maistre is usually portrayed as Edmund Burke’s French counterpart, as they both wrote important treatises against the French Revolution. Although Maistre did share many of Burke’s conservative political views, he was much more than a political thinker. He was above all a religious thinker who interpreted political events through the prism of a particular retributionist theology. Accor...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014